Removing rain streaks from a single rainy image is challenging because rain streaks vary spatially across the image. This paper studies the problem by combining conventional image processing techniques and deep learning techniques. An improved weighted guided image filter (IWGIF) is proposed to extract high-frequency information from the rainy image. The high-frequency information consists mainly of rain streaks and noise, and it guides a rain-streak-aware deep convolutional neural network (RSADCNN) to pay more attention to the rain streaks. The efficiency and explanation ability of RSADCNN are thereby improved. Experiments show that the proposed algorithm significantly outperforms state-of-the-art methods on both synthetic and real-world images in terms of qualitative and quantitative measures. It is useful for autonomous navigation in raining conditions.
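The base/detail split that feeds the network can be illustrated with a plain (unweighted) guided filter on a 1-D signal. This is a deliberate simplification of the IWGIF described above, and the helper names (`box_mean`, `guided_filter`) are illustrative, not from the paper:

```python
def box_mean(x, r):
    """Mean filter with window radius r, shrinking the window at boundaries."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out

def guided_filter(guide, src, r=2, eps=0.04):
    """Classic guided filter: per-window linear model q = a*guide + b."""
    mean_g = box_mean(guide, r)
    mean_s = box_mean(src, r)
    corr_gg = box_mean([g * g for g in guide], r)
    corr_gs = box_mean([g * s for g, s in zip(guide, src)], r)
    var_g = [cg - mg * mg for cg, mg in zip(corr_gg, mean_g)]
    cov_gs = [c - mg * ms for c, mg, ms in zip(corr_gs, mean_g, mean_s)]
    a = [c / (v + eps) for c, v in zip(cov_gs, var_g)]
    b = [ms - ai * mg for ms, ai, mg in zip(mean_s, a, mean_g)]
    mean_a, mean_b = box_mean(a, r), box_mean(b, r)
    return [ma * g + mb for ma, g, mb in zip(mean_a, guide, mean_b)]

# Flat background with one spike standing in for a rain streak.
signal = [0.2] * 8 + [0.9] + [0.2] * 8
base = guided_filter(signal, signal, r=2, eps=0.04)      # smooth base layer
detail = [s - b for s, b in zip(signal, base)]           # streaks + noise
```

The high-frequency `detail` layer concentrates the spike while the flat regions stay near zero, which is the signal a streak-aware network would attend to.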
Model-based single image dehazing algorithms restore haze-free images with sharp edges and rich details for real-world hazy images, but at the expense of low PSNR and SSIM values for synthetic hazy images. Data-driven ones restore haze-free images with high PSNR and SSIM values for synthetic hazy images, but with low contrast and even some remaining haze for real-world hazy images. In this paper, a novel single image dehazing algorithm is introduced by combining model-based and data-driven approaches. Both the transmission map and the atmospheric light are first estimated by model-based methods and then refined by dual-scale generative adversarial network (GAN)-based approaches. The resulting algorithm forms a neural augmentation that converges very fast, while the corresponding purely data-driven approach might not converge. The haze-free image is restored by using the estimated transmission map and atmospheric light together with the Koschmieder law. Experimental results indicate that the proposed algorithm can remove haze well from both real-world and synthetic hazy images.
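The final restoration step inverts the Koschmieder law I = J*t + A*(1 - t). A minimal pixel-wise sketch follows; the clamp value `t_min` is an assumed safeguard against division by tiny transmissions, not a parameter from the paper:

```python
def recover_radiance(hazy, transmission, airlight, t_min=0.05):
    """Invert the Koschmieder law I = J*t + A*(1 - t) pixel by pixel."""
    out = []
    for i_px, t in zip(hazy, transmission):
        t = max(t, t_min)                    # avoid amplifying noise as t -> 0
        j = (i_px - airlight) / t + airlight
        out.append(min(max(j, 0.0), 1.0))    # clip to valid intensity range
    return out

A = 0.9                                      # estimated atmospheric light
scene = [0.2, 0.5, 0.8]                      # ground-truth radiance (toy row)
t_map = [0.6, 0.6, 0.6]                      # estimated transmission map
hazy = [j * t + A * (1 - t) for j, t in zip(scene, t_map)]  # synthesize haze
restored = recover_radiance(hazy, t_map, A)  # should recover `scene`
```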
Model-based single image dehazing algorithms restore images with sharp edges and rich details at the expense of low PSNR values. Data-driven ones restore images with high PSNR values but with low contrast and even some remaining haze. In this paper, a novel single image dehazing algorithm is introduced by fusing model-based and data-driven approaches. The transmission map and the atmospheric light are initialized by model-based methods and refined by deep-learning-based approaches, which together form a neural augmentation. The haze-free image is then restored by using the transmission map and the atmospheric light. Experimental results indicate that the proposed algorithm can remove haze well from both real-world and synthetic hazy images.
Shadow and highlight regions exist in a low dynamic range (LDR) image captured from a high dynamic range (HDR) scene. Restoring the saturated regions of an LDR image is an ill-posed problem. In this paper, the saturated regions of an LDR image are restored by fusing model-based and data-driven approaches. With such a neural augmentation, two synthetic LDR images are first generated from the underlying LDR image via model-based methods: one brighter than the input image to restore the shadow regions, and one darker than the input image to restore the highlight regions. The two synthetic images are then refined by a novel exposure-aware saturation restoration network (EASRN). Finally, the two synthetic images and the input image are combined by an HDR synthesis algorithm or a multi-scale exposure fusion algorithm. The proposed algorithm can be embedded in any smartphone or digital camera to produce information-enriched LDR images.
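The two synthetic exposures can be sketched by rescaling exposure in linear light. The gamma-2.2 transfer curve and the `synthesize_exposure` helper are assumptions for illustration, not the paper's model-based method:

```python
def synthesize_exposure(ldr, ev):
    """Scale exposure by 2**ev in linear light (assumed gamma 2.2), then clip."""
    gain = 2.0 ** ev
    out = []
    for p in ldr:
        linear = p ** 2.2                    # decode to (assumed) linear light
        out.append(min((linear * gain) ** (1 / 2.2), 1.0))  # re-encode + clip
    return out

ldr = [0.02, 0.5, 0.99]                      # shadow, midtone, near-saturated
brighter = synthesize_exposure(ldr, +2.0)    # lifts shadow detail
darker = synthesize_exposure(ldr, -2.0)      # pulls highlights into range
```

In the pipeline above, `brighter` and `darker` would then be refined by EASRN before HDR synthesis or multi-scale exposure fusion.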
Existing single image haze removal algorithms based on prior knowledge and assumptions are subject to many limitations in practical applications and may suffer from noise and halo amplification. This paper proposes an end-to-end system that reduces these defects by combining prior knowledge and deep learning. The hazy image is first decomposed into a base layer and a detail layer by the weighted guided image filter (WGIF), and the atmospheric light is estimated from the base layer. The base layer is then passed to an efficient deep convolutional network to estimate the transmission map. To restore objects close to the camera without fully amplifying the noise in the sky or in heavily hazy scenes, an adaptive strategy based on the value of the transmission map is proposed: if the transmission at a pixel is small, the haze-free image is restored from the base layer of the hazy image via the atmospheric scattering model; otherwise, the full hazy image is used. Experiments show that the proposed method achieves superior performance over existing methods.
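The adaptive strategy can be sketched pixel-wise: where transmission is small (sky or dense haze), restore from the base layer so the noisy detail layer is not amplified; elsewhere restore from the full hazy image. The threshold `t0` and the helper name are illustrative choices, not values from the paper:

```python
def adaptive_restore(hazy, base, t_map, airlight, t0=0.3):
    """Restore via the atmospheric scattering model, choosing the input
    per pixel: base layer where transmission is small, full image otherwise."""
    out = []
    for i_full, i_base, t in zip(hazy, base, t_map):
        src = i_base if t < t0 else i_full   # drop noisy detail in weak-t areas
        t_c = max(t, 0.05)                   # assumed lower clamp on t
        out.append((src - airlight) / t_c + airlight)
    return out

A = 0.9
hazy = [0.85, 0.40]    # pixel 0: sky-like, pixel 1: nearby object
base = [0.88, 0.42]    # base layer from an edge-preserving filter (given)
t_map = [0.10, 0.70]
restored = adaptive_restore(hazy, base, t_map, A, t0=0.3)
```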
Model-driven single image dehazing has been widely studied on top of different priors due to its wide range of applications. The ambiguity between object radiance and haze, and the noise amplification in sky regions, are two inherent problems of model-driven single image dehazing. In this paper, a dark direct attenuation prior (DDAP) is proposed to address the former problem. A novel haze line averaging is proposed to reduce the morphological artifacts caused by the DDAP, which enables a weighted guided image filter to further reduce the morphological artifacts while preserving the fine structures in the image. A multi-scale dehazing algorithm is then proposed to address the latter problem by adopting Laplacian and Gaussian pyramids to decompose the hazy image into different levels and applying different haze removal and noise reduction approaches to restore the scene radiance at the different levels of the pyramid. The resulting pyramid is collapsed to restore the haze-free image. Experimental results demonstrate that the proposed algorithm outperforms state-of-the-art dehazing algorithms and indeed prevents noise from being amplified in sky regions.
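The pyramid machinery can be sketched in 1-D. `build_pyramid` and `collapse` below are generic Laplacian-pyramid routines (reconstruction is exact by construction, since each band stores the residual of its own upsampling), not the paper's code; the per-level dehazing and denoising steps are omitted:

```python
def downsample(x):
    """Blur with a [1/4, 1/2, 1/4] kernel, then keep every other sample."""
    n = len(x)
    blurred = [(x[max(i - 1, 0)] + 2 * x[i] + x[min(i + 1, n - 1)]) / 4
               for i in range(n)]
    return blurred[::2]

def upsample(x, n):
    """Linear interpolation back to length n."""
    out = []
    for i in range(n):
        j = i / 2
        lo = min(int(j), len(x) - 1)
        hi = min(lo + 1, len(x) - 1)
        out.append(x[lo] + (j - lo) * (x[hi] - x[lo]))
    return out

def build_pyramid(x, levels):
    gauss = [x]
    for _ in range(levels - 1):
        gauss.append(downsample(gauss[-1]))
    laps = []
    for k in range(levels - 1):
        up = upsample(gauss[k + 1], len(gauss[k]))
        laps.append([a - b for a, b in zip(gauss[k], up)])  # band-pass residual
    laps.append(gauss[-1])                                  # coarsest level
    return laps

def collapse(laps):
    x = laps[-1]
    for lap in reversed(laps[:-1]):
        up = upsample(x, len(lap))
        x = [a + b for a, b in zip(up, lap)]
    return x

signal = [0.0, 0.1, 0.4, 0.9, 0.9, 0.4, 0.1, 0.0]
pyr = build_pyramid(signal, 3)
rec = collapse(pyr)
```

In the multi-scale algorithm, each `pyr` level would be processed with a level-appropriate haze removal or noise reduction step before collapsing.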
Masked image modeling (MIM) performs strongly in pre-training large vision Transformers (ViTs). However, small models that are critical for real-world applications cannot, or can only marginally, benefit from this pre-training approach. In this paper, we explore distillation techniques to transfer the success of large MIM-based pre-trained models to smaller ones. We systematically study different options in the distillation framework, including distilling targets, losses, input, network regularization, and sequential distillation, revealing that: 1) distilling token relations is more effective than CLS-token- and feature-based distillation; 2) using an intermediate layer of the teacher network as the target performs better than using the last layer when the depth of the student mismatches that of the teacher; 3) weak regularization is preferred. With these findings, we achieve significant fine-tuning accuracy improvements over from-scratch MIM pre-training on ImageNet-1K classification for the ViT-Tiny, ViT-Small, and ViT-Base models, with +4.2%/+2.4%/+1.4% gains, respectively. Our TinyMIM model of base size achieves 52.2 mIoU on ADE20K semantic segmentation, which is +4.1 mIoU higher than the MAE baseline. Our TinyMIM model of tiny size achieves 79.6% top-1 accuracy on ImageNet-1K image classification, which sets a new record for small vision models of the same size and computation budget. This strong performance suggests an alternative way to develop small vision Transformer models, namely, by exploring better training methods rather than introducing inductive biases into architectures as in most previous works. Code is available at https://github.com/OliverRensu/TinyMIM.
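Finding 1), distilling token relations, can be sketched as a soft cross-entropy between the row-softmaxed scaled dot-product attention maps of teacher and student. This is a toy pure-Python sketch with assumed helper names, not TinyMIM's implementation:

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def token_relations(q, k):
    """Row-softmaxed Q*K^T / sqrt(d): one relation distribution per token."""
    d = len(q[0])
    scores = [[sum(qi * ki for qi, ki in zip(qrow, krow)) / math.sqrt(d)
               for krow in k] for qrow in q]
    return [softmax(row) for row in scores]

def relation_distill_loss(teacher_rel, student_rel, eps=1e-12):
    """Mean soft cross-entropy between teacher and student relation maps."""
    loss = 0.0
    for t_row, s_row in zip(teacher_rel, student_rel):
        loss -= sum(t * math.log(s + eps) for t, s in zip(t_row, s_row))
    return loss / len(teacher_rel)

# Two tokens with 2-d queries/keys: the student's collapsed queries yield
# uniform relations, so its loss exceeds a perfectly matching student's.
teacher_rel = token_relations([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
student_rel = token_relations([[0.5, 0.5], [0.5, 0.5]], [[1.0, 0.0], [0.0, 1.0]])
```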
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
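NAIVEATTACK-style trigger injection into the raw data can be sketched as stamping a small patch and relabeling to the attacker's target class before distillation begins. The poisoning ratio, trigger shape, and function names here are illustrative, not the paper's configuration:

```python
def stamp_trigger(image, trigger, corner=(0, 0)):
    """Overwrite a patch of `image` (2-D list of pixels) with the trigger."""
    out = [row[:] for row in image]          # copy so the original is untouched
    r0, c0 = corner
    for r, trow in enumerate(trigger):
        for c, v in enumerate(trow):
            out[r0 + r][c0 + c] = v
    return out

def naive_attack(dataset, trigger, target_label, poison_every=2):
    """Poison every k-th raw (image, label) pair before distillation."""
    poisoned = []
    for i, (img, label) in enumerate(dataset):
        if i % poison_every == 0:
            poisoned.append((stamp_trigger(img, trigger), target_label))
        else:
            poisoned.append((img, label))
    return poisoned

trigger = [[1.0]]                            # 1x1 bright-pixel trigger
dataset = [([[0.0, 0.0], [0.0, 0.0]], 0),
           ([[0.0, 0.0], [0.0, 0.0]], 1)]
poisoned = naive_attack(dataset, trigger, target_label=9, poison_every=2)
```

DOORPING differs in that the trigger itself is iteratively re-optimized throughout the distillation procedure rather than fixed up front.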
Benefiting from its intrinsic ability to exploit supervision information, contrastive learning has recently achieved promising performance in the field of deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms limit the performance of existing algorithms. 1) The quality of positive samples heavily depends on carefully designed data augmentations, and inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls samples from the same cluster close while pushing away those from other clusters, by maximizing and minimizing the cross-view cosine similarity between positive and negative samples, respectively. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
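The objective, maximizing cross-view cosine similarity for positives while minimizing it between the centers of different high-confidence clusters (the negatives), can be sketched in a toy form; this is not the paper's exact loss:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cluster_contrastive_loss(pos_pairs, center_pairs):
    """Reward cross-view agreement of positives; penalize similarity
    between centers of different high-confidence clusters. Lower is better."""
    pos = sum(cosine(a, b) for a, b in pos_pairs) / len(pos_pairs)
    neg = sum(cosine(a, b) for a, b in center_pairs) / len(center_pairs)
    return neg - pos

# Well-separated embedding: positives aligned, cluster centers orthogonal.
well_separated = cluster_contrastive_loss(
    [([1.0, 0.0], [1.0, 0.0])], [([1.0, 0.0], [0.0, 1.0])])
# Collapsed embedding: positives disagree, centers coincide.
collapsed = cluster_contrastive_loss(
    [([1.0, 0.0], [0.0, 1.0])], [([1.0, 0.0], [1.0, 0.0])])
```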
As one of the prevalent methods for achieving automation, Imitation Learning (IL) presents promising performance in a wide range of domains. However, despite the considerable improvement in policy performance, research on the explainability of IL models is still limited. Inspired by recent approaches in explainable artificial intelligence, we propose a model-agnostic explanation framework for IL models called R2RISE. R2RISE aims to explain the overall policy performance with respect to the frames in the demonstrations. It iteratively retrains the black-box IL model on randomly masked demonstrations and uses the conventional evaluation outcome, the environment return, as the coefficient to build an importance map. We also conducted experiments to investigate three major questions concerning the equality of frames' importance, the effectiveness of the importance map, and the connections between importance maps from different IL models. The results show that R2RISE successfully distinguishes important frames in the demonstrations.
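The importance-map construction can be sketched RISE-style: average random binary frame masks weighted by the return of the policy retrained on each masked demonstration. In this toy sketch, `evaluate` stands in for the expensive retrain-and-rollout step, and all names are illustrative:

```python
import random

def importance_map(num_frames, evaluate, num_masks=500, p_keep=0.5, seed=0):
    """Per-frame importance: return-weighted average over random keep-masks."""
    rng = random.Random(seed)
    scores = [0.0] * num_frames
    weight = [0.0] * num_frames
    for _ in range(num_masks):
        mask = [1 if rng.random() < p_keep else 0 for _ in range(num_frames)]
        ret = evaluate(mask)                 # retrain on masked demo, roll out
        for i, kept in enumerate(mask):
            scores[i] += ret * kept
            weight[i] += kept
    return [s / w if w else 0.0 for s, w in zip(scores, weight)]

def toy_evaluate(mask):
    """Stand-in evaluator: frame 3 carries the critical behaviour."""
    return 1.0 if mask[3] else 0.1

imp = importance_map(8, toy_evaluate, num_masks=500)
```

Frames whose presence in the masked demonstrations correlates with high returns accumulate the largest importance values, so `imp` peaks at the critical frame.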